A Systematic Review of Fault Prediction Performance in Software Engineering
Abstract
BACKGROUND – The accurate prediction of where faults are likely to occur in code is important since it can help direct test effort, reduce costs and improve the quality of software. As a consequence, many different fault prediction models have been developed and reported in the literature. However, there is no consensus on what constitutes effective fault prediction. OBJECTIVE – To investigate how the context of models, the independent variables used and the modelling techniques applied influence the performance of fault prediction models. METHOD – A systematic literature review identifying 208 fault prediction studies published from January 2000 to December 2010. A synthesis of the quantitative and qualitative results of those 35 studies which report sufficient contextual and methodological information according to the criteria we develop and apply. This synthesis includes a detailed analysis of the relative predictive performances of 203 models (or model variants) reported in 18 of these 35 studies which allow us to calculate precision, recall and f-measure. RESULTS – There are large variations in the performance of these 203 models. The models that perform well tend to be based on simple modelling techniques such as Naïve Bayes or Logistic Regression. Combinations of independent variables have been used by models that perform well. These combinations include product, process and people metrics. Feature selection has been applied to these combinations by models which are performing particularly well. In addition, such models tend to have been trained on large data sets which are rich with faults. CONCLUSION – The methodology used to build models seems to be influential to predictive performance. Although there is a set of fault prediction studies in which confidence is possible, many open questions remain about effective fault prediction. More studies are needed that use a reliable methodology and which report their context, methodology and performance comprehensively. This would enable a meta-analysis across more studies. It would also produce models more likely to be used by industry.
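The abstract names precision, recall and f-measure as the measures used to compare models, but does not define them. The sketch below uses the standard confusion-matrix definitions, which is an assumption about how the review computes them. The function name and the module counts are hypothetical and serve only as an illustration.

```python
def precision_recall_f_measure(tp, fp, fn):
    """Compute precision, recall and f-measure from confusion-matrix counts.

    tp: faulty modules correctly predicted as faulty
    fp: non-faulty modules wrongly predicted as faulty
    fn: faulty modules the model missed
    Standard definitions are assumed; the abstract does not spell them out.
    """
    # Precision: share of predicted-faulty modules that really are faulty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of truly faulty modules that the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure: harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical counts: 40 faulty modules flagged, 10 false alarms, 20 missed.
p, r, f = precision_recall_f_measure(tp=40, fp=10, fn=20)
print(f"precision={p:.3f} recall={r:.3f} f-measure={f:.3f}")
```

Because all three measures derive from the same three counts, a study that reports enough of its confusion matrix lets a reader recompute them on a common footing, which is presumably why only a subset of studies could be compared this way.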
Similar resources
A Systematic Review of Fault Prediction approaches used in Software Engineering
BACKGROUND – The accurate prediction of where faults are likely to occur in code is important because it can help direct test effort, reduce costs and improve the quality of software. OBJECTIVE – To summarise and analyse the published fault prediction studies in order to identify approaches used to build, measure and validate the performance of fault prediction models. METHOD – A systematic lit...
Evaluation of Classifiers in Software Fault-Proneness Prediction
Reliability of software counts on its fault-prone modules. This means that the less software consists of fault-prone units the more we may trust it. Therefore, if we are able to predict the number of fault-prone modules of software, it will be possible to judge the software reliability. In predicting software fault-prone modules, one of the contributing features is software metric by which one ...
A Systematic Literature Review on Software Fault Prediction based on Qualitative and Quantitative Factors
The growing demand for higher operational effectiveness and reliability in industrial processes has drawn considerable attention to fault detection techniques. Researchers and practitioners remain concerned with correct prediction when developing systems. On the other hand the most popular research area is software fault or fault prediction. Software fault prediction has both security and f...
Software Fault Prediction: A Systematic Mapping Study
Context: Software fault prediction has been an important research topic in the software engineering field for more than 30 years. Software defect prediction models are commonly used to detect faulty software modules based on software metrics collected during the software development process. Objective: Data mining techniques and machine learning studies in the fault prediction software context ...
On the application of genetic programming for software engineering predictive modeling: A systematic review
doi:10.1016/j.eswa.2011.03.041. The objective of this paper is to investigate the evidence for symbolic regression using genetic programming (GP) being an effective method for prediction and estimation in software engineering, when compared w...
Publication date: 2011